Abstract

Given known item parameters, unbiased estimators are derived
1) for an examinee's ability parameter $\theta$ and for his proportion-correct
true score $\zeta$, 2) for $s_\theta^2$, the variance of $\theta$ across examinees
in the group tested, also for $s_\zeta^2$, and 3) for the parallel-forms
reliability of $\hat\theta$, the maximum likelihood estimator of ability.
Unbiased Estimators of Ability Parameters, of Their Variance,
and of Their Parallel-Forms Reliability


This paper is primarily concerned with determining the statistical
bias in the maximum likelihood estimate $\hat\theta$ of the examinee ability
parameter ($\theta$) in item response theory (IRT) [Lord, 1980]; also of
certain functions of such parameters. We will deal only with unidimensional
tests composed of dichotomously scored items. We assume
the item response function is three-parameter logistic (2).

Available results for the sampling variance of $\hat\theta$ are currently
limited to the case where the item parameters are known; the present
derivations are limited to this case also. This limitation is tolerable
in situations where the item parameters are predetermined, as in item
banking and tailored testing.

In the absence of a prior distribution for $\theta$, it is well known
that examinees with perfect scores have $\hat\theta = \infty$; also that examinees
who perform near or below the chance level on multiple-choice items
may be given large negative values of $\hat\theta$. This (correctly) suggests
that $\hat\theta$ is positively biased for high-ability examinees and negatively
biased for low-ability examinees. Will a correction of $\hat\theta$ for bias
be helpful in such cases?


*This  work  was  supported  in  part  by  contract  N00014-80-C-0402, 
project  designation  NR  150-453  between  the  Office  of  Naval  Research  and 
Educational  Testing  Service.  Reproduction  in  whole  or  in  part  is  permitted 
for  any  purpose  of  the  United  States  Government. 
It is also 'well known' that for any ordinary group of examinees,
the variance ($s_{\hat\theta}^2$) of $\hat\theta$ across examinees is larger than the variance
($s_\theta^2$) of the true $\theta$. The ratio $s_\theta^2/s_{\hat\theta}^2$ is closely related to the
classical-test-theory reliability of $\hat\theta$ considered as the examinee's
test score. Thus it is not enough for us to know that $s_{\hat\theta}^2 \to s_\theta^2$ as the
number $n$ of test items becomes large; we need to know how the relation
of $s_{\hat\theta}^2$ to $s_\theta^2$ varies as a function of $n$. We also need a
better estimate of $s_\theta^2$ than its maximum-likelihood estimator $s_{\hat\theta}^2$.
These objectives can be achieved by correcting $s_{\hat\theta}^2$ for bias.

The methods used to derive formulas for correction for bias are
presented here in detail for at least two reasons: 1) Experience
with similar derivations has shown that it is easy to reach erroneous
results if details are not spelled out. 2) The general methods used
here are easily transferred to solve other problems, such as a) correction
of item parameters for bias, b) obtaining higher-order approximations
to the sampling variance of $\hat\theta$.


1. Statistical Bias in $\hat\theta$ and $\hat\zeta$


The method used here to find the bias of $\hat\theta$ is adapted from the
'adjusted order of magnitude' procedure detailed by Shenton and Bowman
(1977). They assume their data to be a sample from a population divided into
a denumerable number of subsets. For them, the population proportion
of observations in a given subset is a known function of the parameter
$\theta$ whose value they wish to estimate. Their sample estimate of
$\theta$ is therefore a function of observed sample proportions in the
various subsets. Since our data do not readily fit this picture, we
cannot use their final published formulas but must instead derive our
own.

Throughout Section 1, we deal with a single fixed examinee whose
ability $\theta$ is the parameter to be estimated. All item parameters are
assumed known.

1.1  Preliminaries 

The maximum likelihood estimate $\hat\theta$ is obtained by solving the
likelihood equation

$$\sum_{i=1}^{n} (u_i - \hat P_i)\,\frac{\hat P_i'}{\hat P_i \hat Q_i} = 0 , \tag{1}$$

where $u_i = 0$ or $1$ is the examinee's response to item $i$ ($i = 1, 2, \ldots, n$),
$P_i \equiv P_i(\theta)$ is the response function for item $i$, $Q_i \equiv 1 - P_i$, $P_i'$
is the derivative of $P_i$ with respect to $\theta$, and a caret indicates
that the function is to be evaluated at $\hat\theta$. We deal with the case
where $P_i$ is the three-parameter logistic function

$$P_i \equiv c_i + \frac{1 - c_i}{1 + e^{-A_i(\theta - b_i)}} , \qquad A_i \equiv 1.7\,a_i . \tag{2}$$

We will assume

1. $\theta$ is a bounded variable,

2. the item parameters $a_i$ and $b_i$ are bounded,

3. $c_i$ is bounded away from 1
(thus $P_i$ and $Q_i$ are bounded away from 0 and 1);

4. as $n$ becomes large, the statistical characteristics of the
test stabilize.

Rather than define this last assumption formally, the reader
may substitute the more restrictive assumption usually made in mental
test theory: that a test is lengthened by adding strictly parallel forms.
With these assumptions, the conditions of Bradley and Gart (1962)
are satisfied. It follows from their theorems that $\hat\theta$ is a consistent
estimator of $\theta$ and that $\sqrt{n}\,(\hat\theta - \theta)$ is asymptotically normally
distributed with mean zero and variance $\lim_{n\to\infty} n \big/ \sum_i (P_i'^2/P_iQ_i)$. The existence of
this limit is guaranteed by assumption 4.
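
The likelihood equation (1) has no closed-form solution under the three-parameter logistic model, so in practice it is solved numerically. The following Python sketch is an illustration only, not the procedure used for the results reported here: the function names, the bisection search, and the search interval [-6, 6] are assumptions introduced for the example. When $c_i > 0$, equation (1) may have more than one root, and bisection returns just one of them.

```python
import math

def p3pl(theta, A, b, c):
    """Three-parameter logistic response function P_i(theta), with A = 1.7 * a."""
    return c + (1.0 - c) / (1.0 + math.exp(-A * (theta - b)))

def dp3pl(theta, A, b, c):
    """Derivative P'_i(theta) = A (P - c) Q / (1 - c)."""
    p = p3pl(theta, A, b, c)
    return A * (p - c) * (1.0 - p) / (1.0 - c)

def likelihood_eq(theta, u, items):
    """Left side of the likelihood equation (1) at a trial value of theta."""
    total = 0.0
    for ui, (A, b, c) in zip(u, items):
        p = p3pl(theta, A, b, c)
        total += (ui - p) * dp3pl(theta, A, b, c) / (p * (1.0 - p))
    return total

def mle_theta(u, items, lo=-6.0, hi=6.0, tol=1e-10):
    """Bisection root of (1) on [lo, hi]; None when there is no sign change."""
    f_lo = likelihood_eq(lo, u, items)
    f_hi = likelihood_eq(hi, u, items)
    if f_lo * f_hi > 0:
        return None  # e.g. a perfect score, for which no finite estimate exists
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = likelihood_eq(mid, u, items)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)
```

Each item is passed as a hypothetical tuple `(A, b, c)`. A perfect response pattern returns `None`, which mirrors the remark in the introduction that such examinees have an infinite estimate.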

For compactness, we will rewrite (1) as

$$\hat L_1 \equiv \sum_{i=1}^{n} \hat\Gamma_{1i} = 0 , \tag{3}$$

where by definition
found that


Unbiased  Estimator 


(15) 
We will denote the Fisher information by

$$I \equiv -\mathcal{E}(dL_1/d\theta) = -n\gamma_2 = \sum_i \frac{P_i'^2}{P_i Q_i} . \tag{16}$$

Setting (6) equal to zero, the likelihood equation can now be written
in terms of the $\gamma_s$ and the $\varepsilon_s$ as

$$-\varepsilon_1 = x(\gamma_2 + \varepsilon_2) + \tfrac{1}{2} x^2(\gamma_3 + \varepsilon_3) + \tfrac{1}{6} x^3(\gamma_4 + \varepsilon_4) + \tfrac{1}{24} x^4(\gamma_5 + \varepsilon_5) + \cdots . \tag{17}$$

We will need some information about the order of magnitude of the
terms such as those in (17). It may be seen from (7) that each $\varepsilon_s$
has the form

$$\varepsilon_s = \frac{1}{n}\sum_i K_{si}(u_i - P_i) ,$$

where $K_{si}$ does not depend on $n$ or on $u_i$. Since $\theta$, $a_i$, and $b_i$ are
bounded and $1 - c_i$ is bounded away from zero (assumptions 1-3), the $K_{si}$
and thus $\varepsilon_s$ are bounded. By assumption
(4), the bound does not depend on $n$. The same conclusion holds for $\gamma_s$.

Since $\sqrt{n}\,x$ is asymptotically normally distributed with zero mean
and finite variance, it follows that $\mathcal{E}x^r$ ($r = 1, 2, \ldots$) is of order
$n^{-r/2}$. A similar statement is true of $\mathcal{E}\varepsilon_s^t$. Thus finally
$|\mathcal{E}x^r\varepsilon_s^t| \le (\mathcal{E}x^{2r}\,\mathcal{E}\varepsilon_s^{2t})^{1/2}$,
so that $\mathcal{E}x^r\varepsilon_s^t$ is of order $n^{-(r+t)/2}$ ($r, t = 1, 2, \ldots$).
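1.3 First-Order Variance of $\hat\theta$

To clarify the procedure, let us derive from (17) the familiar
formula for the asymptotic variance of $\hat\theta$. Square (17) and take
expectations to obtain

$$\mathcal{E}\varepsilon_1^2 = \gamma_2^2\,\mathcal{E}x^2 + 2\gamma_2\,\mathcal{E}x^2\varepsilon_2 + \mathcal{E}x^2\varepsilon_2^2 + \gamma_2\gamma_3\,\mathcal{E}x^3 + \gamma_2\,\mathcal{E}\varepsilon_3 x^3 + \cdots . \tag{18}$$

If we wish to neglect terms $o(n^{-1})$ (of higher order than $n^{-1}$),
equation (18) becomes

$$\mathcal{E}x^2 = \mathcal{E}\varepsilon_1^2 / \gamma_2^2 + o(n^{-1}) . \tag{19}$$

By (13) and (16), because of local independence,

$$\mathcal{E}\varepsilon_1^2 = \frac{1}{n^2}\,\mathcal{E}\Big[\sum_i \frac{P_i'}{P_iQ_i}(u_i - P_i)\sum_j \frac{P_j'}{P_jQ_j}(u_j - P_j)\Big]
= \frac{1}{n^2}\sum_i\sum_j \frac{P_i'P_j'}{P_iQ_iP_jQ_j}\,\mathcal{E}(u_i - P_i)(u_j - P_j)$$

$$= \frac{1}{n^2}\sum_i \frac{P_i'^2}{P_i^2Q_i^2}\operatorname{Var}u_i
= \frac{1}{n^2}\sum_i \frac{P_i'^2}{P_iQ_i} . \tag{20}$$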
Thus, finally,

$$\operatorname{Var}\hat\theta = \frac{1}{I} + o(n^{-1}) , \tag{21}$$

a well-known result. It is derived here to clarify the reasoning to
be used subsequently. If $\hat\theta$ is substituted for $\theta$ on the right side
of (21), the formula will still be correct to the specified order of
approximation.
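
As a numerical illustration of (16) and (21), the sketch below evaluates the test information and the first-order standard error $1/\sqrt{I}$ at a given ability. It is a minimal example under stated assumptions: the function names and the `(A, b, c)` tuple layout for items are hypothetical, and the three-parameter logistic form is the one assumed throughout the paper.

```python
import math

def item_information(theta, A, b, c):
    """Item information I_i = P'^2 / (P Q) for a three-parameter logistic item."""
    p = c + (1.0 - c) / (1.0 + math.exp(-A * (theta - b)))
    dp = A * (p - c) * (1.0 - p) / (1.0 - c)
    return dp * dp / (p * (1.0 - p))

def total_information(theta, items):
    """Fisher information I of equation (16): the sum of the item informations."""
    return sum(item_information(theta, A, b, c) for A, b, c in items)

def standard_error(theta, items):
    """First-order standard error of the ML estimate, the square root of (21)."""
    return 1.0 / math.sqrt(total_information(theta, items))
```

For four identical items with $A = 1$, $b = 0$, $c = 0$, each item contributes $P Q = .25$ at $\theta = 0$, so the information is 1 and the standard error is 1.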


1.4 Statistical Bias of $\hat\theta$

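Take the expectation of (17) to obtain

$$-\mathcal{E}_1\varepsilon_1 = \gamma_2\,\mathcal{E}_1 x + \mathcal{E}_1 x\varepsilon_2 + \tfrac{1}{2}\gamma_3\,\mathcal{E}_1 x^2 , \tag{22}$$

where $\mathcal{E}_1$ indicates an expectation in which only terms of order $n^{-1}$ are to
be retained. Also multiply (17) by $\varepsilon_2$ and take expectations to obtain

$$-\mathcal{E}_1\varepsilon_1\varepsilon_2 = \gamma_2\,\mathcal{E}_1 x\varepsilon_2 . \tag{23}$$

By (9),

$$\mathcal{E}\varepsilon_r = 0 , \qquad r = 1, 2, \ldots . \tag{24}$$

From (13) and (14),

$$\mathcal{E}\varepsilon_1\varepsilon_2 = \frac{1}{n^2}\sum_i \frac{A_ic_i}{1 - c_i}\,\frac{P_i'^2}{P_i^3Q_i}\operatorname{Var}u_i
= \frac{1}{n^2}\sum_i \frac{A_ic_i}{1 - c_i}\,\frac{P_i'^2}{P_i^2} . \tag{25}$$

Substituting (16) and (25) into (23), we have the covariance

$$\mathcal{E}_1 x\varepsilon_2 = \frac{1}{nI}\sum_i \frac{A_ic_i}{1 - c_i}\,\frac{P_i'^2}{P_i^2} . \tag{26}$$

Finally, substituting (16), (21), (24), and (26) into (22) and solving
for $\mathcal{E}_1 x$, we have the bias

$$B_1(\hat\theta) \equiv \mathcal{E}_1(\hat\theta - \theta) = \frac{1}{I^2}\Big(\sum_i \frac{A_ic_i}{1 - c_i}\,\frac{P_i'^2}{P_i^2} + \tfrac{1}{2}\,n\gamma_3\Big) . \tag{27}$$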


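This may be rewritten as

$$B_1(\hat\theta) = \frac{1}{I^2}\sum_i A_i I_i\big(\phi_i - \tfrac{1}{2}\big) , \tag{28}$$

where

$$\phi_i \equiv \frac{P_i - c_i}{1 - c_i} \qquad\text{and}\qquad I_i \equiv \frac{P_i'^2}{P_iQ_i} . \tag{29}$$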
Since $I$ is of order $n$, $B_1(\hat\theta)$ is of order $n^{-1}$. It may be of
interest to note that in the special case where all items are equivalent
(all $P_i(\theta)$ are the same), the bias simplifies to $B_1(\hat\theta) = (\phi - \tfrac{1}{2})P / n\phi P'$.
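
Formula (28) is easy to evaluate once the item parameters are known. The following Python sketch is offered as an illustration under stated assumptions, not as the program used for Table 2: the function name `bias_theta` and the `(A, b, c)` tuple layout are hypothetical. It uses the identity $P_i' = (1 - c_i) A_i \phi_i (1 - \phi_i)$, which follows from the three-parameter logistic form.

```python
import math

def bias_theta(theta, items):
    """First-order bias B_1 of the ML ability estimate, equation (28)."""
    info = 0.0      # test information I
    weighted = 0.0  # sum of A_i * I_i * (phi_i - 1/2)
    for A, b, c in items:
        phi = 1.0 / (1.0 + math.exp(-A * (theta - b)))
        p = c + (1.0 - c) * phi
        dp = (1.0 - c) * A * phi * (1.0 - phi)
        item_info = dp * dp / (p * (1.0 - p))
        info += item_info
        weighted += A * item_info * (phi - 0.5)
    return weighted / info ** 2
```

With no guessing ($c_i = 0$) and all $b_i = 0$, the bias vanishes at $\theta = 0$ because every $\phi_i$ equals one half; it is positive above that point and negative below it, which is the pattern seen in Table 2.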

1.5  Numerical  Results 

A hypothetical test was designed to approximate the College
Entrance Examination Board's Scholastic Aptitude Test, Verbal Section.
This test is composed of $n = 90$ five-choice items. Some information
about the distributions of the parameters of the 90 hypothetical items
is given in Table 1.

The standard error and bias of $\hat\theta$ were computed from (21) and
from (27) respectively for various values of $\theta$. The results are
shown in Table 2. It appears that the bias in $\hat\theta$ is negligible for
moderate values of $\theta$, but is sizable for extreme values. Note that
the bias is positively correlated with $\theta$. Because of guessing, zero
bias does not occur at $\theta = 0$ but at $\theta = .34$ approximately.

1.6  Variance  and  Bias  of  Estimated  True  Score 

TABLE 1

Range and Quartiles of the Item Parameters
in 90-Item Hypothetical Test

                  $a_i = A_i/1.7$    $b_i$    $c_i$
Highest value          1.88          2.32     .47
Q1                     1.07          1.15     .20
Median                  .83           .38     .15
Q3                      .69          -.41     .13
Lowest value            .41         -3.94     .01

Since the ability scale is not unique, any monotonic transformation
of $\theta$ can serve as a measure of ability. Two transformations are
particularly useful: $e^\theta$ and

$$\zeta \equiv \frac{1}{n}\sum_{i=1}^{n} P_i(\theta) , \tag{30}$$

the proportion-correct true score (the number-right true score divided
by the number of items). One important reason for using the latter
transformation is the following.

Ordinarily, as in Table 2, we find large standard errors of
$\hat\theta$ where $\theta$ is extreme. Usually these large standard errors are no
more harmful to the user than are the smaller standard errors found
when $\theta$ is near the level aimed at by the test. There is a reason
why this is so: If it were not, the user should have designed his test
so as to reduce those standard errors that were troublesome to him.

We see that from this point of view the size of a difference on
the $\theta$ scale does not correspond to its importance. The discrepancy
is greatly reduced, however, if we measure ability on the $\zeta$ scale
instead of on the $\theta$ scale. This is one reason, among several, why
we are interested in the variance and bias of

$$\hat\zeta \equiv \sum_i P_i(\hat\theta)/n . \tag{31}$$

TABLE 2

Standard Error and Statistical Bias in $\hat\theta$

  $\theta$    $\sqrt{\operatorname{Var}\hat\theta}$    $B_1(\hat\theta)$
   3.5         .60               .24
   3.0         .43               .12
   2.5         .31               .06
   2.0         .23               .032
   1.5         .19               .011
   1.0         .19               .0032
   0.5         .20               .0012
   0           .22              -.0028
  -0.5         .25              -.010
  -1.0         .31              -.025
  -1.5         .41              -.05
  -2.0         .54              -.09
  -2.5         .70              -.14
  -3.0         [illegible]      -.22
  -3.5        1.09              -.31

Although the proportion-correct score

$$z \equiv \sum_i u_i/n \tag{32}$$

is an unbiased estimator of $\zeta$, $z$ is never a fully efficient
estimator of $\zeta$ unless $c_i = 0$ and $a_i = a_j$ ($i, j = 1, 2, \ldots, n$):
the sampling variance

$$\operatorname{Var} z = \frac{1}{n^2}\sum_i P_iQ_i \tag{33}$$

is not as small as the sampling variance of $\hat\zeta$, which we must now
derive.

By (31),

$$\operatorname{Var}\hat\zeta = \frac{1}{n^2}\Big(\sum_{i=1}^{n} P_i'\Big)^2 \operatorname{Var}\hat\theta .$$

By (21) and (16)
where

$$P_i'' \equiv d^2P_i/d\theta^2 .$$

Taking expectations, and neglecting higher-order terms, we have for
the bias

$$B_1(\hat\zeta) \equiv \mathcal{E}_1(\hat\zeta - \zeta) = \frac{1}{n}\Big[B_1(\hat\theta)\sum_i P_i' + \tfrac{1}{2}\operatorname{Var}\hat\theta\sum_i P_i''\Big] . \tag{37}$$

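This can be rewritten as

$$B_1(\hat\zeta) = \frac{\zeta'}{I^2}\sum_i A_iI_i\big(\phi_i - \tfrac{1}{2}\big) + \frac{\zeta''}{2I} , \tag{38}$$

where $\zeta' \equiv \sum_i P_i'/n$ and $\zeta'' \equiv \sum_i P_i''/n$. Let us note in passing that when
all items are equivalent (all $P_i(\theta)$ are the same), $\hat\zeta = z$ and its
bias (38) is zero.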
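
The bias of the estimated true score can be computed directly from (37) together with (21) and (28). The sketch below is a hedged illustration: the function name `bias_zeta` and the `(A, b, c)` tuple layout are assumptions made for the example, and the second derivative is taken as $P_i'' = A_i(1 - 2\phi_i)P_i'$, which follows from differentiating the three-parameter logistic function.

```python
import math

def bias_zeta(theta, items):
    """First-order bias of the estimated true score, equation (37)."""
    n = len(items)
    info = weighted = sum_dp = sum_d2p = 0.0
    for A, b, c in items:
        phi = 1.0 / (1.0 + math.exp(-A * (theta - b)))
        p = c + (1.0 - c) * phi
        dp = (1.0 - c) * A * phi * (1.0 - phi)   # P'_i
        d2p = A * (1.0 - 2.0 * phi) * dp         # P''_i
        item_info = dp * dp / (p * (1.0 - p))
        info += item_info
        weighted += A * item_info * (phi - 0.5)
        sum_dp += dp
        sum_d2p += d2p
    b_theta = weighted / info ** 2   # bias of theta-hat, equation (28)
    var_theta = 1.0 / info           # variance of theta-hat, equation (21)
    return (b_theta * sum_dp + 0.5 * var_theta * sum_d2p) / n
```

When every item has the same parameters the two terms of (37) cancel, which reproduces the remark that the bias is zero for equivalent items.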

1.7  Numerical  Results 

Table 3 shows the bias in $\hat\zeta$ for the same hypothetical test considered
in Section 1.5. The biases are all positive. However, they are
negligible at all except the lowest ability levels. This tends to confirm
our choice of the $\zeta$ scale of ability rather than the $\theta$ scale
for many purposes.
TABLE 3

Standard Error of $z$ and of $\hat\zeta$,
and Statistical Bias of $\hat\zeta$

  $\theta$    $\zeta$    $\sqrt{\operatorname{Var} z}$    $\sqrt{\operatorname{Var}\hat\zeta}$    $B_1(\hat\zeta)$
   3.5    .981       .014            .014          .00045
   3.0    .966       .019            .018          .00052
   2.5    .937       .024            .023          .00064
   2.0    .891       .031            .029          .00059
   1.5    .812       .037            .035          .00021
   1.0    .715       .042            .040          .00026
   0.5    .608       .045            .042          .00061
   0      .506       .046            .043          .00061
  -0.5    .416       .047            .042          .00062
  -1.0    .344       .046            .038          .00061
  -1.5    .291       .045            .037          .00085
  -2.0    .254       .044            .033          .0014
  -2.5    .227       .042            .029          .0020
  -3.0    .211       .042            .025          .0024
  -3.5    .199       .041            .021          .0026
As a matter of incidental interest, for selected values of true
score Table 3 compares the standard error (35) of the maximum-likelihood
estimator $\hat\zeta$ with the standard error (33) of the unbiased estimator
$z$ (proportion-correct score). There is little difference in accuracy
between the two estimators for $\zeta \ge .5$. At low true-score levels, the
maximum-likelihood estimator is much better than the proportion of
correct answers.


2. Unbiased Estimation of $s_\theta^2$, of $s_\zeta^2$; Test Reliability

The symbols $s_\theta^2$ and $s_\zeta^2$ are used for the sample variance of $\theta$
and of $\zeta$ across the $N$ examinees in the sample:

$$s_\theta^2 \equiv \frac{1}{N}\sum_{a=1}^{N}\theta_a^2 - \Big(\frac{1}{N}\sum_{a=1}^{N}\theta_a\Big)^2 .$$

The maximum-likelihood estimators of $s_\theta^2$ and $s_\zeta^2$ are $s_{\hat\theta}^2$ and $s_{\hat\zeta}^2$,
the sample variances across examinees of $\hat\theta$ and of $\hat\zeta$.

2.1 Asymptotically Unbiased Estimator of $\sigma_\theta^2$

Assume that our examinees are a random sample of $N$ from some
population. Denote by $\sigma_\theta^2$ the population variance of $\theta$. Then $Ns_\theta^2/(N-1)$
is an unbiased estimator of $\sigma_\theta^2$. Since $s_\theta^2$ is unobservable, our first
task is to find a function of $\hat\theta$ that is an asymptotically unbiased
estimator of $\sigma_\theta^2$.
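By the formula for the variance of a sum we have

$$\sigma_{\hat\theta}^2 \equiv \sigma_{\theta+x}^2 = \sigma_\theta^2 + \sigma_x^2 + 2\sigma_{\theta x} , \tag{40}$$

where $\sigma^2$ denotes a variance across all examinees in the population
and $\sigma_{\theta x}$ is the corresponding population covariance. By a well-known
identity from the analysis of variance

$$\sigma_x^2 = \mathcal{E}_\theta\,\sigma_{x|\theta}^2 + \sigma_{\mathcal{E}(x|\theta)}^2 , \tag{41}$$

where $\mathcal{E}_\theta$ denotes an expectation across all examinees in the population.
Similarly,

$$\sigma_{\theta x} = \sigma_{\theta,\,\mathcal{E}(x|\theta)} . \tag{42}$$

Substituting (41) and (42) into (40), transposing, writing $B_1 \equiv \mathcal{E}(x|\theta)$
as in (28), and dropping the subscript from $B_1$ for convenience, we
have

$$\sigma_\theta^2 = \sigma_{\hat\theta}^2 - \sigma_B^2 - 2\sigma_{\theta B} - \mathcal{E}_\theta\,\sigma_{x|\theta}^2 . \tag{43}$$

Since by (28) $B$ is of order $n^{-1}$, its variance is of order $n^{-2}$,
so $\sigma_B^2$ can be neglected in (43). Since Section 1 deals with a single
fixed examinee, the symbol $\sigma_{x|\theta}^2$ in (43) has the same meaning as
$\operatorname{Var}\hat\theta$ in (21):

$$\sigma_{x|\theta}^2 = \frac{1}{I(\theta)} + o(n^{-1}) ,$$

where $I \equiv I(\theta)$ is given by (16). Since $\sigma_{x|\theta}^2$ is of order $n^{-1}$, the
effect of replacing $\theta$ by $\hat\theta$ on the right is negligible:

$$\mathcal{E}_\theta\,\sigma_{x|\theta}^2 = \mathcal{E}\,\frac{1}{I(\hat\theta)} + o(n^{-1}) .$$

By similar reasoning, we may replace $\sigma_{\theta B}$ in (43) by $\sigma_{\hat\theta\hat B}$, where $\hat B$
is defined by (27) with $\theta$ replaced by $\hat\theta$. The result of these
approximations is that

$$\sigma_\theta^2 = \sigma_{\hat\theta}^2 - 2\sigma_{\hat\theta\hat B} - \mathcal{E}\,\frac{1}{I(\hat\theta)} + o(n^{-1}) . \tag{44}$$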


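A useful estimator of $\sigma_\theta^2$ can be calculated from

$$\hat\sigma_\theta^2 \equiv \frac{N}{N-1}\,s_{\hat\theta}^2 - \frac{2N}{N-1}\,s_{\hat\theta\hat B} - \frac{1}{N}\sum_{a=1}^{N}\frac{1}{I(\hat\theta_a)} , \tag{45}$$

where

$$s_{\hat\theta\hat B} \equiv \frac{1}{N}\sum_{a=1}^{N}\hat\theta_a\hat B_a - \Big(\frac{1}{N}\sum_{a=1}^{N}\hat\theta_a\Big)\Big(\frac{1}{N}\sum_{a=1}^{N}\hat B_a\Big)$$

and $\hat B_a$ is given by (27) with $\theta$ replaced by $\hat\theta_a$. If we wish to
estimate the sample variance of ability $s_\theta^2$ rather than the population
variance $\sigma_\theta^2$, we can use

$$\hat s_\theta^2 \equiv s_{\hat\theta}^2 - 2s_{\hat\theta\hat B} - \frac{N-1}{N^2}\sum_{a=1}^{N}\frac{1}{I(\hat\theta_a)} . \tag{46}$$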



The second and third terms of (44) are of order $n^{-1}$, an order
of magnitude smaller than the first term but larger than the neglected
terms. The covariance of $\hat\theta$ and $\hat B$ is usually positive, as can be
readily seen from Table 2. Since $I(\hat\theta)$ is necessarily positive, it
appears that usually $\sigma_\theta^2 < \sigma_{\hat\theta}^2$, an inequality that is frequently assumed
without proof. It is not clear whether this inequality is necessarily
true.
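
The estimators (45) and (46) need only three quantities per examinee: the ability estimate, its estimated bias, and the test information at the estimate. The following Python sketch shows one way to assemble them. It is an illustration under stated assumptions: the function name and argument layout are hypothetical, and the per-examinee bias and information values are taken as already computed from (27) and (16).

```python
def corrected_ability_variances(theta_hat, bias_hat, info):
    """Return (sigma2_hat, s2_hat) from equations (45) and (46).

    theta_hat : ML ability estimates, one per examinee
    bias_hat  : estimated bias B_a for each examinee
    info      : test information I(theta_hat_a) for each examinee
    """
    N = len(theta_hat)
    mean_t = sum(theta_hat) / N
    mean_b = sum(bias_hat) / N
    # Sample variance of the estimates and their covariance with the bias.
    s2 = sum((t - mean_t) ** 2 for t in theta_hat) / N
    s_tb = sum((t - mean_t) * (b - mean_b)
               for t, b in zip(theta_hat, bias_hat)) / N
    inv_info = sum(1.0 / i for i in info)
    sigma2_hat = N / (N - 1) * s2 - 2 * N / (N - 1) * s_tb - inv_info / N  # (45)
    s2_hat = s2 - 2 * s_tb - (N - 1) / N ** 2 * inv_info                   # (46)
    return sigma2_hat, s2_hat
```

The two results differ only by the factor $(N-1)/N$, as they should, since (46) targets the sample variance and (45) the population variance.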

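2.2 The Reliability of $\hat\theta$

Consider the parallel-forms reliability coefficient $\rho_{\hat\theta\hat\theta'}$,
the correlation between scores $\hat\theta$ and $\hat\theta'$ on two parallel tests.
For present purposes, two tests are parallel when for each item in one
test there is an item in the other test with the same item response
function. Let us estimate

$$\rho_{\hat\theta\hat\theta'} \equiv \frac{\sigma_{\hat\theta\hat\theta'}}{\sigma_{\hat\theta}\,\sigma_{\hat\theta'}} \tag{47}$$

from a single test administration by substituting asymptotically unbiased
estimators of the numerator and of the denominator into (47).

As in (41),

$$\sigma_{\hat\theta\hat\theta'} = \mathcal{E}_\theta\,\sigma_{\hat\theta\hat\theta'|\theta} + \sigma_{\mathcal{E}(\hat\theta|\theta),\,\mathcal{E}(\hat\theta'|\theta)} . \tag{48}$$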
Priority in obtaining this result belongs to Sympson [Note 1].

Replacing population values on the right by the corresponding sample
statistics, we have a sample estimator of the parallel-forms reliability
coefficient of $\hat\theta$:

$$\hat\rho_{\hat\theta\hat\theta'} \equiv 1 - \frac{N-1}{N^2 s_{\hat\theta}^2}\sum_{a=1}^{N}\frac{1}{I(\hat\theta_a)} . \tag{52}$$
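
The sample estimator (52) can be computed from the ability estimates and the test information at each estimate. The Python sketch below is a minimal illustration; the function name and argument layout are hypothetical, and the formula coded is (52) in the form $1 - (N-1)\sum_a [1/I(\hat\theta_a)] / (N^2 s_{\hat\theta}^2)$.

```python
def reliability_theta_hat(theta_hat, info):
    """Sample parallel-forms reliability of the ML ability estimates, equation (52).

    theta_hat : ML ability estimates, one per examinee
    info      : test information I(theta_hat_a) for each examinee
    """
    N = len(theta_hat)
    mean_t = sum(theta_hat) / N
    s2 = sum((t - mean_t) ** 2 for t in theta_hat) / N  # sample variance of the estimates
    error_sum = sum(1.0 / i for i in info)              # sum of first-order error variances
    return 1.0 - (N - 1) * error_sum / (N ** 2 * s2)
```

The estimate falls as the average error variance $1/I$ grows relative to the spread of the ability estimates, which is the behavior expected of a reliability coefficient.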


Since $\hat\theta$ is biased and its error of estimation $\hat\theta - \theta$ is
correlated with $\theta$, we should not expect the usual reliability formulas
of classical test theory to apply. A similar but not identical case is discussed in
Lord and Novick (1968, Section 9.8). Thus $\rho_{\hat\theta\hat\theta'}$, $\rho_{\hat\theta\theta}^2$, and
$\sigma_\theta^2/\sigma_{\hat\theta}^2$ are not interchangeable definitions of reliability. Since
correlational measures are hard to interpret in the absence of
linearity and homoscedasticity, we will not now push this investigation
of reliability further.

2.3  Corresponding  Results  for  True  Score 

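By the same reasoning used to obtain (44) we have

$$\sigma_\zeta^2 = \sigma_{\hat\zeta}^2 - 2\sigma_{\hat\zeta,\hat B(\hat\zeta)} - \mathcal{E}\,\frac{\hat\zeta'^2}{I(\hat\theta)} + o(n^{-1}) . \tag{53}$$

A useful estimator of $\sigma_\zeta^2$ can be calculated from

$$\hat\sigma_\zeta^2 \equiv \frac{N}{N-1}\,s_{\hat\zeta}^2 - \frac{2N}{N-1}\,s_{\hat\zeta,\hat B(\hat\zeta)} - \frac{1}{N}\sum_{a=1}^{N}\frac{\hat\zeta_a'^2}{I(\hat\theta_a)} . \tag{54}$$

To estimate $s_\zeta^2$, we can use

$$\hat s_\zeta^2 \equiv s_{\hat\zeta}^2 - 2s_{\hat\zeta,\hat B(\hat\zeta)} - \frac{N-1}{N^2}\sum_{a=1}^{N}\frac{\hat\zeta_a'^2}{I(\hat\theta_a)} . \tag{55}$$

As in (50)-(52) we have

[Equation (56) is illegible in the source.]

$$\rho_{\hat\zeta\hat\zeta'} = 1 - \frac{1}{\sigma_{\hat\zeta}^2}\,\mathcal{E}_\theta\,\frac{\zeta'^2}{I(\theta)} + o(n^{-1}) , \tag{57}$$

$$\hat\rho_{\hat\zeta\hat\zeta'} \equiv 1 - \frac{N-1}{N^2 s_{\hat\zeta}^2}\sum_{a=1}^{N}\frac{\hat\zeta_a'^2}{I(\hat\theta_a)} . \tag{58}$$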
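
The true-score versions (55) and (58) have the same structure as their ability-scale counterparts, with the error variance $1/I$ replaced by $\hat\zeta'^2/I$. The following Python sketch illustrates the computation under stated assumptions: the function name and argument layout are hypothetical, and the per-examinee values of $\hat\zeta_a$, $\hat B_a(\hat\zeta)$, $\hat\zeta_a'$, and $I(\hat\theta_a)$ are taken as already available.

```python
def true_score_summary(zeta_hat, bias_hat, dzeta, info):
    """Return (s2_hat, rho_hat) from equations (55) and (58).

    zeta_hat : estimated true scores, one per examinee
    bias_hat : estimated bias of each zeta_hat
    dzeta    : zeta' evaluated at each examinee's ability estimate
    info     : test information at each examinee's ability estimate
    """
    N = len(zeta_hat)
    mean_z = sum(zeta_hat) / N
    mean_b = sum(bias_hat) / N
    s2 = sum((z - mean_z) ** 2 for z in zeta_hat) / N
    s_zb = sum((z - mean_z) * (b - mean_b)
               for z, b in zip(zeta_hat, bias_hat)) / N
    # First-order error variance of zeta_hat for each examinee, summed.
    error_sum = sum(d * d / i for d, i in zip(dzeta, info))
    s2_hat = s2 - 2 * s_zb - (N - 1) / N ** 2 * error_sum   # (55)
    rho_hat = 1.0 - (N - 1) * error_sum / (N ** 2 * s2)     # (58)
    return s2_hat, rho_hat
```

When the bias terms are all zero, the two outputs are linked by `s2_hat = rho_hat * s2`, which gives a quick consistency check on an implementation.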


2.4  Numerical  Results  for  True  Scores 

At moderate ability levels, (28) provides adequate but usually
negligible corrections for bias in $\hat\theta$. Experience shows that at
very low ability levels, the usual test length ($n$) of 50 or 100
items is not long enough for the asymptotic results of (28) to apply.
For example, an examinee whose true $\theta$ is $-3$ may easily obtain an
estimated ability $\hat\theta$ of $-30$ or of $-\infty$. For sufficiently long tests,
such extreme values of $\hat\theta$ would have negligible probability, but
with the usual values of $n$, equation (28) is totally inadequate for
correcting $\hat\theta$ for bias at low ability levels.

This same difficulty carries over to the unbiased estimation
of $s_\theta^2$ using (46). Since all ability levels are involved in (46),
the formula is useless in practice for any group that contains even
a few low-ability examinees. Fortunately, this difficulty does not
carry over to the estimation of ability on the true-score ($\zeta$)
scale.

The hypothetical SAT Verbal Test of Tables 1-3 was administered to
a typical group of 2995 hypothetical examinees. The bias in $\hat\zeta$
was estimated for each examinee and a corrected $\hat\zeta$ obtained from
(38):

$$\text{corrected } \hat\zeta \equiv \hat\zeta - \hat B_1(\hat\zeta) .$$

In a few cases where the corrected $\hat\zeta$ would have been below the
chance level $\sum_i c_i/n$, the corrected $\hat\zeta$ was set equal to $\sum_i c_i/n$.
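
The correction just described, including the floor at the chance level, amounts to a one-line rule. The Python sketch below is an illustration only; the function name and arguments are hypothetical, and the bias value is taken as already estimated for the examinee.

```python
def corrected_zeta(zeta_hat, bias_hat, c):
    """Bias-corrected true-score estimate, floored at the chance level.

    zeta_hat : estimated proportion-correct true score for one examinee
    bias_hat : estimated first-order bias of zeta_hat
    c        : lower-asymptote parameters c_i of the n items
    """
    chance_level = sum(c) / len(c)  # mean of the c_i, the lowest attainable true score
    return max(zeta_hat - bias_hat, chance_level)
```

For an examinee well above chance the rule simply subtracts the estimated bias; the floor matters only for estimates that sit just above the mean of the $c_i$.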

The mean of the 2995 true $\zeta$ used to generate the data was
.5280, the mean of the uncorrected $\hat\zeta$ was .5294, the mean of the
corrected $\hat\zeta$ was .5288. Thus the correction was in the right
direction, but not large enough. The uncorrected mean $\hat\zeta$ was already
so accurate as to leave little room for improvement.

Next, (55) was used to estimate $s_\zeta$. The true value was
$s_\zeta = .1610$, the standard deviation of $\hat\zeta$ was $s_{\hat\zeta} = .1660$, the corrected
estimate from (55) was $\hat s_\zeta = .1614$. The correction worked
very well here.

The parallel-forms reliability of $\hat\zeta$ was estimated from (58)
to be $\hat\rho_{\hat\zeta\hat\zeta'} = .9420$. We have no 'true' value against which this can
be compared, but the estimate seems a reasonable one. The Kuder-Richardson
formula-20 reliability of number-right scores for these
data is .9275.
It should be remembered that both the formulas and the numerical
results in this report apply in situations where the item parameters
are known. These formulas may be satisfactory for situations where
the item parameters have been estimated from large groups not containing
the examinees whose ability estimates are to be corrected for bias.

These formulas will not be adequate for situations where the item
parameters and ability parameters are estimated simultaneously from
a single data set.
Reference Note